A Study of Sequence Clustering on Protein’s Primary Structure using a Statistical Method
نویسندگان
چکیده
The clustering of biological sequences into biologically meaningful classes denotes two computationally complex challenges: the choice of a biologically pertinent and computable criterion to evaluate the clusters homogenity, and the optimal exploration of the solution space. Here we are analysing the clustering potential of a new method of sequence similarity based on statistical sequence content evaluation. Applying on the same data the popular CLUSTAL W method for sequence similarity we contrasted the results. The analysis, computational efficiency and high accuracy of the results from the new method is encouraging for further development that could make it an appealing alternative to the existent methods.
منابع مشابه
In Silico Analysis of Primary Sequence and Tertiary Structure of Lepidium Draba Peroxidase
Peroxidase enzymes are vastly applicable in industry and diagnosiss. Recently, we introduced a new kind of peroxidase gene from Lepidium draba (LDP). According to protein multiple sequence alignment results, LDP had 93% similarity and 88.96% identity with horseradish peroxidase C1A (HRP C1A). In the current study we employed in silico tools to determine, to which group of peroxidase enzymes LDP...
متن کاملRepeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
متن کاملPhylogenetic Analysis of Beta-Glucanase Producing Actinomycetes Strain TBG-CH22 - A Comparison of Conventional and Molecular Morphometric Approach
Actinomycetes are inexhaustible producers of commercially valuable metabolites, are continually screened for beneficial compounds. The taxonomic and phylogenetic study of novel actinomycetes strains are mostly based on conventional methods and primary DNA structure of 16s rRNA. Although 16s rRNA sequence is well accepted in phylogeny studies, its secondary structures have not been widely used. ...
متن کاملA Comparative Study of Some Clustering Algorithms on Shape Data
Recently, some statistical studies have been done using the shape data. One of these studies is clustering shape data, which is the main topic of this paper. We are going to study some clustering algorithms on shape data and then introduce the best algorithm based on accuracy, speed, and scalability criteria. In addition, we propose a method for representing the shape data that facilitates and ...
متن کاملPhylogenetic Analysis of Beta-Glucanase Producing Actinomycetes Strain TBG-CH22 - A Comparison of Conventional and Molecular Morphometric Approach
Actinomycetes are inexhaustible producers of commercially valuable metabolites, are continually screened for beneficial compounds. The taxonomic and phylogenetic study of novel actinomycetes strains are mostly based on conventional methods and primary DNA structure of 16s rRNA. Although 16s rRNA sequence is well accepted in phylogeny studies, its secondary structures have not been widely used. ...
متن کامل